A method for simultaneous variable selection and outlier identification in linear regression

Author

  • Jennifer Hoeting

Abstract

We suggest a method for simultaneous variable selection and outlier identification based on the computation of posterior model probabilities. This avoids the problem that the selected model depends on the order in which variable selection and outlier identification are carried out. Our method can find multiple outliers and appears to be successful in identifying masked outliers. We also address the problem of model uncertainty via Bayesian model averaging. For problems where the number of models is large, we suggest a Markov chain Monte Carlo approach to approximate the Bayesian model average over the space of all possible variables and outliers under consideration. Software for implementing this approach is described. In an example, we show that model averaging via simultaneous variable selection and outlier identification improves predictive performance and provides more accurate prediction intervals than any single model that might reasonably be selected.

Keywords: Bayesian model averaging; Markov chain Monte Carlo model composition; Masking; Model uncertainty; Posterior model probability
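The idea of scoring variables and outliers jointly through posterior model probabilities can be illustrated with a small sketch. The abstract does not state the paper's priors or outlier model, so everything below is an assumption made for illustration: each model is a pair (predictor subset, outlier subset), suspected outliers are handled with mean-shift indicator columns, posterior model probabilities are approximated by exp(-BIC/2) under a uniform prior over models, and the space is small enough to enumerate rather than sample by Markov chain Monte Carlo. The data and the names `rss`, `posterior_model_probs`, and `max_outliers` are hypothetical.

```python
import itertools
import math


def rss(columns, y):
    """Residual sum of squares of the least-squares fit of y on the given columns."""
    k, n = len(columns), len(y)
    # Augmented normal equations [X'X | X'y].
    a = [[sum(ci[t] * cj[t] for t in range(n)) for cj in columns]
         + [sum(ci[t] * y[t] for t in range(n))] for ci in columns]
    for p in range(k):  # Gaussian elimination with partial pivoting
        piv = max(range(p, k), key=lambda r: abs(a[r][p]))
        a[p], a[piv] = a[piv], a[p]
        for r in range(p + 1, k):
            f = a[r][p] / a[p][p]
            for c in range(p, k + 1):
                a[r][c] -= f * a[p][c]
    beta = [0.0] * k
    for p in reversed(range(k)):  # back substitution
        s = a[p][k] - sum(a[p][c] * beta[c] for c in range(p + 1, k))
        beta[p] = s / a[p][p]
    return sum((y[t] - sum(b * col[t] for b, col in zip(beta, columns))) ** 2
               for t in range(n))


def posterior_model_probs(predictors, y, max_outliers=2):
    """Approximate P(model | data) for every (variables, outliers) pair.

    Assumption: BIC approximation with a uniform prior over models,
    not the prior specification used in the paper.
    """
    n = len(y)
    ones = [1.0] * n
    log_w = {}
    for r in range(len(predictors) + 1):
        for vars_ in itertools.combinations(range(len(predictors)), r):
            for m in range(max_outliers + 1):
                for outs in itertools.combinations(range(n), m):
                    # One indicator column per suspected outlier (mean shift).
                    dummies = [[1.0 if t == i else 0.0 for t in range(n)]
                               for i in outs]
                    cols = [ones] + [predictors[j] for j in vars_] + dummies
                    bic = n * math.log(rss(cols, y) / n) + len(cols) * math.log(n)
                    log_w[(vars_, outs)] = -0.5 * bic
    top = max(log_w.values())  # subtract the maximum for numerical stability
    w = {model: math.exp(v - top) for model, v in log_w.items()}
    total = sum(w.values())
    return {model: v / total for model, v in w.items()}


# Hypothetical data: y depends on x1 only, and observation 7 is shifted by +8.
x1 = [float(t) for t in range(12)]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
noise = [0.10, -0.20, 0.15, -0.05, 0.20, -0.10,
         0.05, 0.00, -0.15, 0.10, -0.05, 0.12]
y = [1.0 + 2.0 * a + e for a, e in zip(x1, noise)]
y[7] += 8.0

probs = posterior_model_probs([x1, x2], y)

# Marginal posterior probabilities, summed over all models (model averaging).
p_outlier = [sum(p for (_, outs), p in probs.items() if i in outs)
             for i in range(len(y))]
p_var = [sum(p for (vars_, _), p in probs.items() if j in vars_)
         for j in range(2)]

best = max(probs, key=probs.get)
print("most probable (variables, outliers):", best)
print("P(observation 7 is an outlier):", round(p_outlier[7], 4))
```

Because every model carries both a variable subset and an outlier subset, the result does not depend on which of the two steps is done first. The marginal probabilities `p_outlier` and `p_var` are sums over all models, which is the model-averaging step; for larger problems the full enumeration here would have to be replaced by a sampling scheme such as the Markov chain Monte Carlo approach the abstract mentions.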

Similar resources

A diagnostic method for simultaneous feature selection and outlier identification in linear regression

A diagnostic method along the lines of forward search is proposed to simultaneously study the effect of individual observations and features on the inferences made in linear regression. The method operates by appending dummy variables to the data matrix and performing backward selection on the augmented matrix. It outputs sequences of feature–outlier combinations which can be evaluated by plots...

Simultaneous robust estimation of multi-response surfaces in the presence of outliers

A robust approach should be considered when estimating regression coefficients in multi-response problems. Many models are derived from the least squares method. Because the presence of outlier data is unavoidable in most real cases and because the least squares method is sensitive to these types of points, robust regression approaches appear to be a more reliable and suitable method for addres...

A statistical test for outlier identification in data envelopment analysis

In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these "deterministic" frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the prese...

Analysis of a Problem Using Various Visions

In this paper, an applied problem in which the response of interest is the number of successes in a specific experiment is considered and studied from various viewpoints. Outlying response values can strongly affect the results of a regression analysis, so diagnostic methods are used to identify them. It is shown that use of arc-sine ...

Comparison Of Hyperbolic And Constant Width Simultaneous Confidence Bands in Multiple Linear Regression Under MVCS Criterion

A simultaneous confidence band gives useful information on the reasonable range of the unknown regression model. In this note, when the predictor variables are constrained to a special ellipsoidal region, hyperbolic and constant width confidence bands for a multiple linear regression model are compared under the minimum volume confidence set (MVCS) criterion. The size of one special an...

Publication date: 1996